Since the solution set obtained by the 3-approximation algorithm for the cluster vertex deletion problem may be large, a new approximation algorithm was proposed by analyzing the characteristics of clusters. In the new algorithm, the number of P3 (induced three-vertex paths) related to each vertex in the graph was counted by examining the first-order and second-order neighbors of each vertex, and the vertex with the maximum number of P3 was then selected into the solution set to eliminate P3 as early as possible, which led to a smaller vertex deletion set. To verify the performance of the new algorithm, several sets of randomized simulations were designed. The simulation results show that the new algorithm outperforms the classic 3-approximation algorithm.
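The greedy idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give tie-breaking rules or say whether the counts are updated incrementally, so this version simply recomputes the P3 counts after every deletion; the function names are hypothetical and this is not the authors' implementation.

```python
from itertools import combinations

def count_p3_per_vertex(adj):
    """Return {v: number of induced three-vertex paths (P3) containing v}."""
    counts = {v: 0 for v in adj}
    for center, nbrs in adj.items():
        # Every induced P3 has exactly one middle vertex, so enumerating
        # non-adjacent neighbor pairs of each vertex counts each P3 once.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:
                for v in (a, center, b):
                    counts[v] += 1
    return counts

def greedy_cluster_vertex_deletion(edges):
    """Repeatedly delete the vertex lying on the most P3s (assumed greedy rule)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    deleted = []
    while True:
        counts = count_p3_per_vertex(adj)
        if not counts or max(counts.values()) == 0:
            return deleted  # no P3 left: every component is a clique
        # Ties are broken by smallest vertex label (an assumption).
        worst = max(sorted(counts), key=counts.get)
        deleted.append(worst)
        for n in adj.pop(worst):
            adj[n].discard(worst)
```

For a path on five vertices, the middle vertex lies on all three P3s, so deleting it alone leaves two disjoint edges, which are already clusters.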
The data link capacity of 1090ES (1090 MHz Extended Squitter) can be expanded by modulating the 1090 MHz signal with phase information; therefore, RS (Reed-Solomon) error-correction coding for a 1090ES expansion system based on 8PSK (8 Phase Shift Keying) phase modulation was studied. Firstly, the total length of the RS code was set to 54 symbols according to the characteristics of RS codes and the data link structure of the 1090ES expansion system. Secondly, the error performance at different RS coding efficiencies was discussed and its influence on the performance of the 1090ES expansion system was analyzed; on this basis, the optimum range of RS coding efficiency was determined to be 0.6-0.7. Finally, the error performance within the selected coding efficiency range was analyzed in detail, and the experimental results show that the number of information symbols can be chosen as 32, 34 or 36. Furthermore, Matlab simulation with RS(54, 32) as an example shows that the designed RS code can effectively improve the error performance of the 1090ES expansion system.
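As a small worked example of the parameters mentioned in the abstract above, the coding efficiency of an RS(n, k) code is k/n, and such a code can correct up to floor((n - k) / 2) symbol errors. The helper name below is hypothetical.

```python
def rs_parameters(n, k):
    """Return (coding efficiency k/n, correctable symbol errors) for RS(n, k)."""
    rate = k / n
    t = (n - k) // 2  # standard RS bound: up to floor((n - k) / 2) symbol errors
    return rate, t

# Candidate information lengths named in the abstract, with n = 54.
candidates = {k: rs_parameters(54, k) for k in (32, 34, 36)}
```

For the three candidates the efficiency runs from roughly 0.59 to 0.67, and RS(54, 32) can correct up to 11 symbol errors per codeword.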
Focusing on the issue that label kernel functions in multi-label feature extraction methods do not take the correlation between labels into consideration, two methods for constructing new label kernel functions were proposed. In the first method, the multi-label data were transformed into single-label data, so that the correlation between labels could be characterized by the label set; a new label kernel function was then defined from the perspective of the loss function of single-label data. In the second method, mutual information was used to characterize the correlation between labels, and a new label kernel function was proposed from the perspective of mutual information. Experiments on three real-life data sets using two multi-label classifiers demonstrated that the feature extraction method with the label kernel function based on the loss function was the best on all measures, with the performance on five evaluation measures increasing by 10% on average; in particular, on the data set Yeast, the evaluation measure Coverage decreased by about 30%. It was closely followed by the feature extraction method with the label kernel function based on mutual information, for which the performance on five evaluation measures increased by 5% on average. The theoretical analysis and simulation results show that the feature extraction methods based on the new output kernel functions can effectively extract features, simplify the learning process of multi-label classifiers and, moreover, improve the performance of multi-label classification.
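The abstract above does not state how mutual information enters the second label kernel, so the following sketch only shows the underlying quantity: the empirical mutual information between two label columns, computed from their joint and marginal frequencies. The function name is hypothetical.

```python
import math
from collections import Counter

def mutual_information(label_a, label_b):
    """Empirical mutual information (in nats) between two label columns."""
    n = len(label_a)
    joint = Counter(zip(label_a, label_b))
    count_a, count_b = Counter(label_a), Counter(label_b)
    return sum(
        (c / n) * math.log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in joint.items()
    )
```

Two labels that always co-occur give a large value, while statistically independent labels give a value of zero, which is what makes the quantity usable as a measure of label correlation.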
When null models of complex networks are generated by a random scrambling algorithm, it is often hard to tell when the null models become stable, because the probability of a successful scrambling differs between null models of different orders. Focusing on this issue, the concept of "successful scrambling times" was defined and used in place of the usual "try scrambling times" to configure the algorithm. The successful scrambling counter was incremented only when the randomly selected edges met the scrambling conditions of the corresponding null model and were therefore successfully scrambled. Generation experiments on null models of every order show that every index becomes stable within a small number of successful scramblings. Further quantitative analyses show that 0-order, 1-order and 2-order null models of good quality can be obtained by setting the successful scrambling times to 2, 1 and 1 times the number of edges of the actual network, respectively.
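The difference between counting attempts and counting successes can be sketched for one case. The example below assumes the common reading of a 1-order null model as one that preserves every node's degree, generated by swapping the endpoints of two randomly chosen edges; the function name, the attempt cap and the return values are illustrative assumptions, not the authors' code.

```python
import random

def degree_preserving_null_model(edges, times_of_edges=1, seed=None):
    """Swap edge endpoints until `times_of_edges * len(edges)` swaps succeed."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    target = times_of_edges * len(edge_list)
    max_attempts = 100 * target + 1000  # safety cap (assumption)
    successes = attempts = 0
    while successes < target and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop: not a success
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # swap would create a duplicate edge: not a success
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        successes += 1  # only valid swaps advance the counter
    return edge_list, successes, attempts
```

Because rejected swaps do not advance the counter, the stopping rule does not depend on how often the scrambling condition happens to fail for a given network or model order.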
To address the low accuracy of active defense technology, a heuristic Trojan detection system based on trajectory analysis was proposed. Two typical kinds of Trojan trajectories were presented, and the danger level of a suspicious file was determined from the behavioral data on the Trojan trajectory using decision rules and a detection algorithm. The experimental results show that this system detects unknown Trojans better than the traditional method, and that some special Trojans can also be detected.
Access requests to computer networks are real-time and change dynamically. In order to detect network intrusion in real time and adapt to the dynamic change of network access data, a real-time network intrusion detection framework based on data streams was proposed. First of all, a misuse detection model and an anomaly detection model were combined, and a knowledge base made up of normal patterns and abnormal patterns was established by initial clustering. Secondly, the similarity between network access data and the normal and abnormal patterns was measured using the dissimilarity between a data point and a data cluster, and the legitimacy of the network access data was determined. Finally, when the network access data stream evolved, the knowledge base was updated by reclustering to reflect the current state of network access. Experiments on the intrusion detection dataset KDDCup99 show that, with 10000 initial clustering samples, 10000 clustering samples in the buffer and an adjustment coefficient of 0.9, the proposed framework achieves a recall rate of 91.92% and a false positive rate of 0.58%. This approaches the result of the traditional non-real-time detection model, while the whole process of learning and detection scans the network access data only once. With the introduction of the knowledge base update mechanism, the proposed framework is more advantageous in the real-time performance and adaptability of intrusion detection.
Concerning the low accuracy of tagging Chinese ambiguous words, a tagging method combining rules with a statistical model was proposed. Firstly, three traditional statistical models, namely the Hidden Markov Model (HMM), Maximum Entropy (ME) and Conditional Random Field (CRF), were applied to the tagging of ambiguous words. Then, an improved mutual information algorithm was applied to learn Part Of Speech (POS) tagging rules, which were obtained by calculating the correlation between the target words and the nearby word units. Finally, the rules were combined with the statistical models to tag Chinese ambiguous words. The experimental results show that after adding the rule algorithm, the average accuracy of POS tagging improves by 5%.
Building an interpretable, large-scale model of protein-compound interactions is a very important subject. A new chemically interpretable model of protein-compound interactions was proposed. The core idea of the model is the hypothesis that a protein-compound interaction can be decomposed into interactions between protein fragments and compound fragments, so that composing the fragment interactions yields the protein-compound interaction. Firstly, amino acid oligomer clusters and compound substructures were used to describe proteins and compounds respectively. Then the protein fragments and the compound fragments were viewed as the two parts of a bipartite graph, with the fragment interactions as its edges. Based on the hypothesis, the protein-compound interaction is determined by the summation of the interactions between protein fragments and compound fragments. The experiment demonstrates that the prediction accuracy of the model reaches 97% and that the model has very good interpretability.
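The summation hypothesis in the abstract above can be written down as a very small sketch. The fragment names and edge weights are invented for illustration, and the abstract does not say how the weights are learned or how the sum is turned into a prediction, so only the scoring step is shown.

```python
def interaction_score(protein_fragments, compound_fragments, edge_weights):
    """Sum the weights of bipartite edges between the two fragment sets."""
    return sum(
        edge_weights.get((p, c), 0.0)  # a missing edge contributes nothing
        for p in protein_fragments
        for c in compound_fragments
    )

# Hypothetical fragment-level edge weights (illustrative values only).
weights = {("AAK", "ring"): 1.5, ("GGS", "amide"): -0.5}
score = interaction_score(["AAK", "GGS"], ["ring", "amide"], weights)
```

Because the score is a plain sum over edges, each fragment pair's contribution can be read off directly, which is the sense in which such a model is interpretable.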
Near-surface defects are hard to identify in ultrasonic phased array Non-Destructive Testing (NDT), so a new intelligent identification method based on fractal theory was proposed to solve this problem. A box-counting dimension algorithm based on linear interpolation was described and used to calculate the box-counting dimension of 140 groups of ultrasonic A-Scan time domain signals. The distribution of the box-counting dimension was then analyzed statistically. The experimental results show that the ultrasonic A-Scan signal is clearly fractal and that the fractal approach is effective for analyzing it. The method has the potential to identify near-surface defects, since the box-counting dimension values of defective signals differ from those of defect-free signals. As a result, the detection rate of near-surface defects can be improved and the omission rate caused by human factors can be reduced in ultrasonic phased array automatic testing.
In integrated support engineering, reliability block diagrams contain a large number of components, a thorough understanding of the system's principles is required, and the operational data are often incomplete. To resolve these problems, a method was proposed that identifies the reliability structure of a system using operational data and the reliability of its units. The system reliability was estimated from the system performance information. In addition, all reliability structure models were traversed and their theoretical reliability was calculated from the reliability information of the system's units; the deviations between the estimated system reliability and all the theoretical reliability values were then calculated and sorted, and the N reliability structure models with the lowest deviations were output as the identification results. The calculation results of a given example show that a combined system based on the voting reliability structure can be identified with a probability of around 80%, with the candidates narrowed to 3% of all possible forms, which can significantly reduce the researcher's workload in identifying the system reliability structure.
An approach for planar pattern tracking was introduced. The tracker could locate a planar pattern accurately in real time using a webcam. The tracking result could be used for computing the camera's position and placing virtual objects in the video. Some common tracking failures caused by "motion blur", "fast motion" and "small target" were discussed. An algorithm combining "LK tracking" techniques with "Hungarian algorithm-based feature correspondence" was used to solve such problems. A method based on non-linear optimization was used to suppress the influence of noise when computing the camera's position. The experiments show that this method can track the planar pattern robustly in some difficult situations, and that the curve of the computed camera position is smooth and accurate.
Most current music visualizations are monotonous. In this article, an aquarelle-style music visualization closely integrated with the music was implemented. It adopts several key technologies, including an aquarelle effect based on the alpha channel, a tidy physical model, piecemeal joint, and composite anti-aliasing. The example of music visualization with bloom, stalk, dimple and goldfish shows that this method achieves a good effect with low CPU occupancy and a small memory requirement.
The view maintenance process in a loosely coupled environment was implemented as a conceptual transaction through the definitions of concurrency dependency and same-source dependency. Based on schema changes, the VMSCNF algorithm was proposed to ensure the consistency of view maintenance processes in non-FIFO network environments. Finally, a preliminary experiment was carried out to verify the efficiency of the proposed algorithm.